77 research outputs found

Comparing human and automatic thesaurus mapping approaches in the agricultural domain

Author: Caracciolo Caterina
Johannsen Gudrun
Keizer Johannes
Lauser Boris
Mayr Philipp
van Hage Willem Robert
Publication date: 01/01/2008

Knowledge organization systems (KOS), like thesauri and other controlled vocabularies, are used to provide subject access to information systems across the web. Due to the heterogeneity of these systems, mapping between vocabularies becomes crucial for retrieving relevant information. However, mapping thesauri is a laborious task, and thus considerable effort is being made to automate the mapping process. This paper examines two mapping approaches involving the agricultural thesaurus AGROVOC, one machine-created and one human-created. We address the basic question "What are the pros and cons of human and automatic mapping and how can they complement each other?" By pointing out the difficulties in specific cases or groups of cases and grouping the sample into simple and difficult types of mappings, we show the limitations of current automatic methods and offer some basic recommendations on which approach to use when. Comment: 10 pages, Int'l Conf. on Dublin Core and Metadata Applications 200

arXiv.org e-Print Archive

Proceedings of the International Conference on Dublin Core and Metadata Applications (DCMI)

CiteSeerX

E-LIS

VU Research Portal

SSOAR - Social Science Open Access Repository

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin

Trusting Semi-structured Web Data

Author: Davide Ceolin
Guus Schreiber
Wan Fokkink
Willem Robert Van Hage
Publication date: 05/03/2020

Abstract. The growth of the Web brings an uncountable amount of useful information to everybody who can access it. These data are often crowdsourced or provided by heterogeneous or unknown sources; therefore, they might be maliciously manipulated or unreliable. Moreover, because of their volume it is often impossible to check them extensively, and this gives rise to massive and ever-growing trust issues. The research presented in this paper aims at investigating the use of data sources and reasoning techniques to address trust issues about Web data. In particular, these investigations include the use of trusted Web sources, of uncertainty reasoning, of semantic similarity measures and of provenance information as possible bases for trust estimation. The intended result of this thesis is a series of analyses and tools that make it possible to better understand and address the problem of trusting semi-structured Web data.

CiteSeerX

The possibilities and challenges of using linked data for academic research: the case of the Talk of Europe project

Author: Aggelen A.E. (Astrid) van
Hage W.R. (Willem Robert) van
Hollink L. (Laura)
Kemman M. (Max)
Kleppe M. (Martijn)
Publication date: 01/06/2015

CWI's Institutional Repository

A spatial column-store to triangulate the Netherlands on the fly

Author: Alvanaki F. (Foteini)
Hage W.R. (Willem Robert) van
Koutsourakis P. (Panagiotis)
Kyzirakos K. (Konstantinos)
Pereira Goncalves R.A. (Romulo Antonio)
Tilburg T. (Tom) van
Werkhoven B. (Ben) van
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 31/10/2016

3D digital city models, important for urban planning, are currently constructed from massive point clouds obtained through airborne LiDAR (Light Detection and Ranging). They are semantically enriched with information obtained from auxiliary GIS data such as cadastral data, which contains information about the boundaries of properties, road networks, rivers, lakes, etc. Technical advances in LiDAR data acquisition systems have made possible the rapid acquisition of high-resolution topographical information for an entire country. Such data sets are now reaching the trillion-point barrier. To cope with this data deluge and provide up-to-date 3D digital city models on demand, current geospatial management strategies should be rethought. This work presents a column-oriented Spatial Database Management System which provides in-situ data access, effective data skipping, efficient spatial operations, and interactive data visualization. Its efficiency and scalability are demonstrated using a dense LiDAR scan of The Netherlands consisting of 640 billion points and the latest cadastral information, and compared with PostGIS.

Crossref

CWI's Institutional Repository

Results of the Ontology Alignment Evaluation Initiative 2011 (Final)

Author: Euzenat Jérôme
Ferrara Alfio
Hage Willem Robert van
Hollink Laura
Meilicke Christian
Nikolov Andriy
Scharffe Francois
Shvaiko Pavel
Stuckenschmidt Heiner
Svab-Zamazal Ondrej
Trojahn Cássia
Publication venue: RWTH
Publication date: 01/01/2011

MAnnheim DOCument Server

MultiFarm: A benchmark for multilingual ontology matching

Author: Andrei Tamilin
Christian Meilicke
Cássia Trojahn
Elena Montiel-Ponsoda
Euzenat
Fred Freitas
Fu
García-Castro
Giunchiglia
Heiner Stuckenschmidt
Jung
Neches
Niepert
Ondřej Šváb-Zamazal
Raúl García-Castro
Ryan Ribeiro de Azevedo
Shenghui Wang
Vojtěch Svátek
Wang
Willem Robert van Hage
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2012

In this paper we present the MultiFarm dataset, which has been designed as a benchmark for multilingual ontology matching. The MultiFarm dataset is composed of a set of ontologies translated into different languages and the corresponding alignments between these ontologies. It is based on the OntoFarm dataset, which has been used successfully for several years in the Ontology Alignment Evaluation Initiative (OAEI). By translating the ontologies of the OntoFarm dataset into eight different languages (Chinese, Czech, Dutch, French, German, Portuguese, Russian, and Spanish) we created a comprehensive set of realistic test cases. Based on these test cases, it is possible to evaluate and compare the performance of matching approaches with a special focus on multilingualism.

VU Research Portal

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

MAnnheim DOCument Server

Archivo Digital UPM

Results of the Ontology Alignment Evaluation Initiative 2007

Author: Euzenat Jérôme
Isaac Antoine
Meilicke Christian
Shvaiko Pavel
Stuckenschmidt Heiner
Sváb Ondrej
Svátek Vojtech
Van Hage Willem Robert
Yatskevich Mikalai
Publication venue: No commercial editor.
Publication date: 11/11/2007

We present the Ontology Alignment Evaluation Initiative 2007 campaign as well as its results. The OAEI campaign aims at comparing ontology matching systems on precisely defined test sets. OAEI-2007 builds on previous campaigns by having 4 tracks with 7 test sets followed by 17 participants. This is a major increase in the number of participants compared to the previous years. Also, the evaluation results demonstrate that more participants are at the forefront. The final and official results of the campaign are those published on the OAEI web site.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Description of alignment evaluation and benchmarking results

Author: Avesani Paolo
Euzenat Jérôme
Giunchiglia Fausto
Mochol Malgorzata
Shvaiko Pavel
Stuckenschmidt Heiner
Sváb Ondrej
Svátek Vojtech
Van Hage Willem Robert
Yatskevich Mikalai
Publication venue: HAL CCSD
Publication date: 01/01/2007

No abstract available

VU Research Portal

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Description of alignment implementation and benchmarking results

Author: Chen Gong
Ehrig Marc
Euzenat Jérôme
Hess Andreas
Hu Wei
Jian Ningsheng
Qu Yuzhong
Stamou Giorgos
Stoilos George
Straccia Umberto
Stuckenschmidt Heiner
Svátek Vojtech
Troncy Raphaël
Valtchev Petko
Van Hage Willem Robert
Yatskevich Mikalai
Publication venue: HAL CCSD
Publication date: 30/12/2005

This deliverable presents the evaluation campaign carried out in 2005 and the improvements that participants in this campaign and others have made to their systems. We draw lessons from this work and propose improvements for future campaigns.

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server